

Search for: All records

Creators/Authors contains: "White, Ethan"


  1. Aichholzer, Oswin; Wang, Haitao (Ed.)
    For fixed d ≥ 3, we construct subsets of the d-dimensional lattice cube [n]^d of size n^{3/(d + 1) - o(1)} with no d+2 points on a sphere or a hyperplane. This improves the previously best known bound of Ω(n^{1/(d-1)}) due to Thiele from 1995. 
    Free, publicly-accessible full text available January 1, 2026
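As a quick check that the new exponent exceeds the earlier one for every fixed d ≥ 3, the two exponents can be compared directly. This is only a heuristic comparison: it ignores the o(1) term, so it describes the behavior for large n.

```latex
\[
\frac{3}{d+1} > \frac{1}{d-1}
\iff 3(d-1) > d+1
\iff d > 2 ,
\]
\[
\text{e.g. } d = 3:\quad \frac{3}{4} \text{ versus } \frac{1}{2}.
\]
```

Both denominators are positive for d ≥ 3, so cross-multiplying preserves the inequality.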
  2. Neuro-symbolic models combine deep learning and symbolic reasoning to produce better-performing hybrids. Not only do neuro-symbolic models perform better, but they also deal better with data scarcity, enable the direct incorporation of high-level domain knowledge, and are more explainable. However, these benefits come at the cost of increased complexity, which may deter the uninitiated from using these models. In this work, we present a framework to simplify the creation of neuro-symbolic models for tree crown delineation and tree species classification via the use of object-oriented programming and hyperparameter tuning algorithms. We show that models created using our framework outperform their non-neuro-symbolic counterparts by as much as two F1 points for crown delineation and three F1 points for species classification. Furthermore, our use of hyperparameter tuning algorithms allows users to experiment with multiple formulations of domain knowledge without the burden of manual tuning. 
    Free, publicly-accessible full text available December 1, 2025
  3. Tanentzap, Andrew J (Ed.)
    The ecology of forest ecosystems depends on the composition of trees. Capturing fine-grained information on individual trees at broad scales provides a unique perspective on forest ecosystems, forest restoration, and responses to disturbance. Individual tree data at wide extents promises to increase the scale of forest analysis, biogeographic research, and ecosystem monitoring without losing details on individual species composition and abundance. Computer vision using deep neural networks can convert raw sensor data into predictions of individual canopy tree species through labeled data collected by field researchers. Using over 40,000 individual tree stems as training data, we create landscape-level species predictions for over 100 million individual trees across 24 sites in the National Ecological Observatory Network (NEON). Using hierarchical multi-temporal models fine-tuned for each geographic area, we produce open-source data available as 1 km² shapefiles with individual tree species prediction, as well as crown location, crown area, and height of 81 canopy tree species. Site-specific models had an average performance of 79% accuracy covering an average of 6 species per site, ranging from 3 to 15 species per site. All predictions are openly archived and have been uploaded to Google Earth Engine to benefit the ecology community and overlay with other remote sensing assets. We outline the potential utility and limitations of these data in ecology and computer vision research, as well as strategies for improving predictions using targeted data sampling. 
  4. 'Ecological Dynamics and Forecasting' is a semester-long course that introduces students to the fundamentals of ecological dynamics and forecasting. The course uses paper-based discussion to introduce concepts and ideas, and R-based tutorials for hands-on application and training. The course material includes a reading list with prompting questions for discussions, teacher's notes for guiding discussions, lecture notes for live coding demonstrations, and video presentations of all R tutorials. The material can be used either for self-directed learning or as all or part of a college or university course. Individual learners have access to all of the necessary material, including discussion questions and instructor notes, on the website. The course focuses on papers with an open-access or free-to-read version where possible, though some materials still rely on access to closed-access papers. The course is structured around two sessions per week, with most weeks consisting of a one-hour paper discussion session and a 1-2 hour session focused on applications in R. R tutorials use publicly available ecological datasets to provide realistic applications. Because the material is organized around content themes, instructors can modify and remix materials based on their course goals and their students' background knowledge. These course materials have been taught for several years at the authors’ university and have also generated significant online engagement, with course videos viewed tens of thousands of times. 
  5. A substantial increase in predictive capacity is needed to anticipate and mitigate the widespread change in ecosystems and their services in the face of climate and biodiversity crises. In this era of accelerating change, we cannot rely on historical patterns or focus primarily on long-term projections that extend decades into the future. In this Perspective, we discuss the potential of near-term (daily to decadal) iterative ecological forecasting to improve decision-making on actionable time frames. We summarize the current status of ecological forecasting and focus on how to scale up, build on lessons from weather forecasting, and take advantage of recent technological advances. We also highlight the need to focus on equity, workforce development, and broad cross-disciplinary and non-academic partnerships. 
    Free, publicly-accessible full text available November 8, 2025
  6. Weinstein, Ben (Ed.)
    # Individual Tree Predictions for 100 million trees in the National Ecological Observatory Network

    Preprint: https://www.biorxiv.org/content/10.1101/2023.10.25.563626v1

    ## Manuscript Abstract

    The ecology of forest ecosystems depends on the composition of trees. Capturing fine-grained information on individual trees at broad scales allows an unprecedented view of forest ecosystems, forest restoration and responses to disturbance. To create detailed maps of tree species, airborne remote sensing can cover areas containing millions of trees at high spatial resolution. Individual tree data at wide extents promises to increase the scale of forest analysis, biogeographic research, and ecosystem monitoring without losing details on individual species composition and abundance. Computer vision using deep neural networks can convert raw sensor data into predictions of individual tree species using ground-truthed data collected by field researchers. Using over 40,000 individual tree stems as training data, we create landscape-level species predictions for over 100 million individual trees for 24 sites in the National Ecological Observatory Network. Using hierarchical multi-temporal models fine-tuned for each geographic area, we produce open-source data available as 1 km^2 shapefiles with individual tree species prediction, as well as crown location, crown area and height of 81 canopy tree species. Site-specific models had an average performance of 79% accuracy covering an average of six species per site, ranging from 3 to 15 species. All predictions were uploaded to Google Earth Engine to benefit the ecology community and overlay with other remote sensing assets. These data can be used to study forest macro-ecology, functional ecology, and responses to anthropogenic change.

    ## Data Summary

    Each NEON site is a single zip archive with tree predictions for all available data. For site abbreviations see: https://www.neonscience.org/field-sites/explore-field-sites.

    For each site, there is a .zip and a .csv. The .zip is a set of 1 km .shp tiles. The .csv is all trees in a single file.

    ## Prediction metadata

    *Geometry* A four-pointed bounding box location in UTM coordinates.

    *indiv_id* A unique crown identifier that combines the year, site and geoindex of the NEON airborne tile. The geoindex (e.g. 732000_4707000) is the UTM coordinate of the top left of the tile.

    *sci_name* The full Latin name of the predicted species, aligned with NEON's taxonomic nomenclature.

    *ens_score* The confidence score of the species prediction. This score is the output of the multi-temporal model for the ensemble hierarchical model.

    *bleaf_taxa* Highest predicted category for the broadleaf submodel.

    *bleaf_score* The confidence score for the broadleaf taxa submodel.

    *oak_taxa* Highest predicted category for the oak model.

    *dead_label* A two-class alive/dead classification based on the RGB data. 0=Alive/1=Dead.

    *dead_score* The confidence score of the Alive/Dead prediction.

    *site_id* The four-letter code for the NEON site. See https://www.neonscience.org/field-sites/explore-field-sites for site locations.

    *conif_taxa* Highest predicted category for the conifer model.

    *conif_score* The confidence score for the conifer taxa submodel.

    *dom_taxa* Highest predicted category for the dominant taxa mode submodel.

    *dom_score* The confidence score for the dominant taxa submodel.

    ## Training data

    The crops.zip contains pre-cropped files. The 369-band hyperspectral files are numpy arrays. RGB crops are .tif files. Naming format is __; for example, "NEON.PLA.D07.GRSM.00583_2022_RGB.tif" is the RGB crop of the predicted crown of NEON data from Great Smoky Mountains National Park (GRSM), flown in 2022. Along with the crops are .csv files for various train-test split experiments for the manuscript.

    ### Crop metadata

    There are 30,042 individuals in the annotations.csv file. We keep all data, but we recommend a filtering step of at least 20 records per species to reduce the chance of taxonomic or data cleaning errors. This leaves 132 species.

    *score* The DeepForest crown score for the crop.

    *taxonID* For the letter species code, see the NEON plant taxonomy for the scientific name: https://data.neonscience.org/taxonomic-lists

    *individual* Unique individual identifier for a given field record and crown crop.

    *siteID* The four-letter code for the NEON site. See https://www.neonscience.org/field-sites/explore-field-sites for site locations.

    *plotID* NEON plot ID within the site. For more information on NEON sampling see: https://www.neonscience.org/data-samples/data-collection/observational-sampling/site-level-sampling-design

    *CHM_height* The LiDAR-derived height for the field sampling point.

    *image_path* Relative pathname for the hyperspectral array; can be read by numpy.load, giving an array of 369 bands * Height * Width.

    *tile_year* Flight year of the sensor data.

    *RGB_image_path* Relative pathname for the RGB array; can be read by rasterio.open().

    # Code repository

    The predictions were made using the DeepTreeAttention repo: https://github.com/weecology/DeepTreeAttention

    Key files include the model definition for a [single year model](https://github.com/weecology/DeepTreeAttention/blob/main/src/models/Hang2020.py) and [data preprocessing](https://github.com/weecology/DeepTreeAttention/blob/cae13f1e4271b5386e2379068f8239de3033ec40/src/utils.py#L59).
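The recommended filtering step (keep only species with at least 20 records) could be sketched as follows. This is a minimal illustration, not the authors' own code: the function names `filter_rare_species` and `load_annotations` are hypothetical, the only column assumed is `taxonID` as described in the crop metadata, and the demo rows are synthetic.

```python
import csv
from collections import Counter


def filter_rare_species(rows, min_records=20, species_key="taxonID"):
    """Keep only rows whose species code occurs at least `min_records` times."""
    counts = Counter(row[species_key] for row in rows)
    return [row for row in rows if counts[row[species_key]] >= min_records]


def load_annotations(path):
    """Read an annotations CSV into a list of dicts (hypothetical helper)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


# Synthetic demo rows (made-up codes, not real NEON records):
# 25 records of one species and 3 records of another.
demo_rows = (
    [{"taxonID": "AAAA", "individual": f"a{i}"} for i in range(25)]
    + [{"taxonID": "BBBB", "individual": f"b{i}"} for i in range(3)]
)
kept = filter_rare_species(demo_rows, min_records=20)
kept_species = sorted({row["taxonID"] for row in kept})
```

With the real data, the same filter would be applied to `load_annotations("annotations.csv")`, assuming that file sits in the working directory. The threshold is left as a parameter so that other cutoffs can be tried.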
  7. Background: Forecasting the responses of natural populations to environmental change is a key priority in the management of ecological systems. This is challenging because the dynamics of multi-species ecological communities are influenced by many factors. Populations can exhibit complex, nonlinear responses to environmental change, often over multiple temporal lags. In addition, biotic interactions, and other sources of multi-species dependence, are major contributors to patterns of population variation. Theory suggests that near-term ecological forecasts of population abundances can be improved by modelling these dependencies, but empirical support for this idea is lacking. Methods: We test whether models that learn from multiple species, both to estimate nonlinear environmental effects and temporal interactions, improve ecological forecasts compared to simpler single-species models for a semi-arid rodent community. Using dynamic generalized additive models, we analyze time series of monthly captures for nine rodent species over 25 years. Results: Model comparisons provide strong evidence that multi-species dependencies improve both hindcast and forecast performance, as models that captured these effects gave better predictions than models that ignored them. We show that changes in abundance for some species can have delayed, nonlinear effects on others, and that lagged, nonlinear effects of temperature and vegetation greenness are key drivers of changes in abundance for this system. Conclusions: Our findings highlight that multivariate models are useful not only to improve near-term ecological forecasts but also to ask targeted questions about ecological interactions and drivers of change. This study emphasizes the importance of jointly modelling species’ shared responses to the environment and their delayed temporal interactions when teasing apart community dynamics. 
    Free, publicly-accessible full text available January 1, 2026